Integration of context-dependent durational knowledge into HMM-based speech recognition

نویسندگان

  • Xue Wang
  • Louis ten Bosch
  • Louis C. W. Pols
چکیده

2. DPDF OF STANDARD HMM This paper presents research on integrating context-dependent durational knowledge into HMM-based speech recognition. The first part of the paper presents work on obtaining relations between the parameters of the context-free HMMs and their durational behaviour, in preparation for the context-dependent durational modelling presented in the second part. Duration integration is realised via rescoring in the post-processing step of our N-best monophone recogniser. We use the multi-speaker TIMIT database for our analyses. The single-state dpdf (which is geometrical) is less important than the dpdf of the whole HMM, because in actual practice it is the latter that models a phonetic segment. In this section we firstly derive the closed-form whole-model dpdf for general leftto-right HMM. Left-to-right is by far the most common type of transition topology used for speech recognition. It may include any number of skipping transitions and parallel paths but no feedback loops that contains more than one state. Then an analysis will be given of the general properties of the dpdf, with the help of some examples of useful topologies.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Allophone-based acoustic modeling for Persian phoneme recognition

Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...

متن کامل

Suprasegmental duration modelling with elastic constraints in automatic speech recognition

In this paper a method of integrating a model of suprasegmental duration with a HMM-based recogniser at the post-processing level is presented. The N-Best utterance output is rescored using a suitable linear combination of acoustic log-likelihood (provided by a set of tied-state triphone HMMs) and duration log-likelihood (provided by a set of durational models). The durational model used in the...

متن کامل

Better HMM-Based Articulatory Feature Extraction with Context-Dependent Model

The majority of speech recognition systems today commonly use Hidden Markov Models (HMMs) as acoustic models in systems since they can powerfully train and map a speech utterance into a sequence of units. Such systems perform even better if the units are context-dependent. Analogously, when HMM techniques are applied to the problem of articulatory feature extraction, contextdependent articulato...

متن کامل

Modelling care of articulation with HMMs is dangerous

Changes in care of articulation (COA) affect both the spectral and durational characteristics of speech. This can have severe repercussions on both the success of speech recognition, and the quality of speech synthesis. Although auto-segmentation has proven useful for measuring the durational effects of COA, an automatic spectral measurement has proven more problematic [5]. In this paper, we wi...

متن کامل

Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

In order to facilitate the entry of data into the computer and its digitalization, automatic recognition of printed texts and manuscripts is one of the considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1996